(57) Abstract

Disclosed is a method and apparatus for aiding foreign language instruction, comprising a language instruction program (50) that runs 
on a multimedia computer (52). The language instruction program uses a story to teach the foreign language by displaying selected frames 
(126) about the story and dialog balloons (136) that include phrases in the foreign language associated with the frames. Translations (128) 
of the phrases are also displayed. As a further aid, a pronunciation guide (122) displays an animated representation of a person's lips
correctly enunciating selected words in the foreign language.

FOREIGN LANGUAGE TEACHING AID METHOD AND APPARATUS

Field of the Invention

The invention generally relates to foreign language teaching aids and, 
more particularly, to an apparatus for and method of aiding the instruction of a foreign 
language using interactive multimedia. 

Background of the Invention

With ever-increasing world trade and other global interaction, the 
desirability and benefits of understanding different languages and cultures have perhaps 
never been more apparent. From this stems an increasing interest in foreign language 
teaching aids. 

Traditional teaching aids include classroom instruction, flash cards,
audio cassettes, magazines, and books. Each has its own advantages and
disadvantages. Classroom instruction provides valuable interaction with instantaneous
feedback, but requires the student to conform to the classroom schedule and pace. Flash
cards, magazines, and books are relatively inexpensive, but do not provide audible
feedback to the student. With audio tapes, the student may not have access to written
text.

More recently, computer software programs have become available for 
teaching foreign languages. The popularity of computer software teaching aids is, in 
large part, due to the proliferation of multimedia computers. Multimedia computers, 
which have the ability to combine text, sound, and graphics, have presented significant
opportunities for the creation of interactive computer-based teaching aids that cater to 
those wanting a relatively inexpensive, and yet effective, means of independent 
language study. 

One popular computer software program that instructs English-speaking
persons in the Japanese language is "Power Japanese," distributed by BayWare,
Incorporated of Mountain View, California. "Power Japanese" and similar language 
learning programs provide a number of advantages over traditional teaching methods. 
In particular, software-based teaching aids have the capability of combining the audio 
benefits of cassettes with the visual benefits of magazines and books, along with drills 
that may be selected based on the progress of the student. A drawback of existing
software-based teaching aids is that it is sometimes still difficult to ascertain how to
correctly pronounce a word or phrase simply by hearing the word or phrase. 

Aside from the particular medium used as a teaching aid, another
challenge in facilitating the learning process is keeping the student interested in the
subject matter being taught. Mangajin, a publication devoted to Japanese pop culture
and language learning, has attempted to maintain the reader's interest by publishing 
Japanese comic strips along with English translations of the Japanese dialog contained 
in the comic strips. The magazine also has published an American comic strip, i.e., 
"Calvin and Hobbes," with a Japanese translation of the dialog contained therein. 

Despite the progress that has been made, there is still a need for the
development of foreign language teaching aids that can clearly and effectively
communicate the pronunciation of words and phrases in unfamiliar languages. In
contrast to the prior art discussed above, the invention promotes the learning process by
providing a variety of effective techniques for associating foreign words and phrases
with a familiar language, and by adding a pronunciation guide to innovative audiovisual
teaching and feedback techniques.

Summary of the Invention 

The invention is an improved method and system of aiding foreign
language instruction using a computer having a processor, a memory, a monitor, and
one or more speakers. The method comprises the steps of: (a) storing a plurality of
audiovisual presentations of several words in the foreign language, each audiovisual
presentation having an audible component that includes a pronunciation of each word in
the foreign language and a visual component that includes a representation of lips
enunciating the word; (b) selecting a word in the foreign language; (c) retrieving the
stored audiovisual presentation for the selected word; and (d) displaying the visual
component of the retrieved audiovisual presentation, including the representation of lips
enunciating the selected word, while playing the audible component of the retrieved
audiovisual presentation. This method enables a user to see lips enunciating the
selected word while hearing the word being spoken, thus aiding the user in learning
how to pronounce the selected word in the foreign language.
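
Steps (a) through (d) can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the class names, the file name "interested.wav", and the frame identifiers are hypothetical, and the simultaneous audio and video playback of step (d) is reduced to an ordered list of events that a real player would render together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudiovisualPresentation:
    """One stored presentation: the spoken word plus its lip animation."""
    audio_clip: str    # hypothetical name of the audio file for the word
    lip_frames: tuple  # hypothetical identifiers of the lip-animation frames


class PronunciationStore:
    """Holds one audiovisual presentation per foreign-language word."""

    def __init__(self):
        self._presentations = {}

    def store(self, word, presentation):
        # Step (a): store the presentation under the word it enunciates.
        self._presentations[word.lower()] = presentation

    def retrieve(self, word):
        # Step (c): look up the stored presentation for the selected word.
        return self._presentations[word.lower()]


def present(store, selected_word):
    """Steps (b)-(d): given a selected word, list what a player would render."""
    presentation = store.retrieve(selected_word)
    events = [("play_audio", presentation.audio_clip)]
    events += [("show_lips", frame) for frame in presentation.lip_frames]
    return events
```

Keying the store by the lower-cased word is an assumption made so that a word selected at the start of a sentence still finds its presentation.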

In one embodiment, a textual representation of the selected word in the
foreign language is also displayed while displaying the representation of lips
enunciating the selected word and while playing the audible component of the
audiovisual presentation of the selected word. Seeing the textual representation in
conjunction with hearing the selected word and seeing the lips pronouncing the selected
word further reinforces the user's learning process.

In accordance with other aspects of the invention, one embodiment
further includes displaying in a familiar language a verbatim translation of the selected
word in the foreign language. In yet another embodiment, the method displays a
verbatim translation of a word related to the selected word to provide a related usage 
example. 

In accordance with yet other aspects of the invention, one embodiment 
further displays a dialog balloon that includes a phrase of words in the foreign language
which relates to a portion of an audiovisual story. The dialog balloon is displayed
while the speech associated with the phrase is played. In addition, another embodiment 
displays a colloquial translation in the familiar language of the foreign language phrase 
displayed in the dialog balloon. 

In accordance with yet other aspects of the invention, the method and
system further provide an audiovisual story with a sequence of video frames and audio
segments of phrases of words in the foreign language. In addition, one embodiment
provides a continuous play mode to display the video frames and audio segments in the
story sequence, so that the audio soundtrack is heard continuously instead of as selected
audio segments. In yet another embodiment, the method and system display a list of
words present in the audio segments and play an audio segment of the story that
contains a selected word, while displaying the video frame associated with that audio
segment. In another embodiment, the method and system display a role of one of the
characters in the audiovisual story.

Brief Description of the Drawings

The foregoing aspects and advantages of the invention will become more
readily appreciated as the invention becomes better understood by reference to the 
following detailed description, when taken in conjunction with the accompanying 
drawings, wherein: 

FIGURE 1 is a block diagram depicting a language instruction program
for use with a multimedia computer in accordance with the invention;

FIGURE 2 is a flow diagram illustrating the steps taken by a program 
developer in creating a preferred embodiment of the language instruction program; 

FIGURES 3A-3C are pictorial representations depicting start-up screens
of an embodiment of the language instruction program;

FIGURES 4A-4B are pictorial representations depicting the
pronunciation guide and various display and control panels for use in operating the 
language instruction program; 

FIGURE 5 is a pictorial representation showing the translation of a 
selected word into a familiar language; 

FIGURES 6A-6L are pictorial representations depicting sequential 
operation of the pronunciation guide in accordance with the invention; 

FIGURE 7 is a pictorial representation of the dictionary mode of the 
language instruction program in accordance with the invention; 

FIGURE 8 is a pictorial representation of the "cast of characters" mode 
of the language instruction program in accordance with the invention; 

FIGURE 9 is a pictorial representation of the episode data structure used 
by the language instruction program in accordance with the invention; 

FIGURE 10 is a pictorial representation of the dictionary data structure 
used by the language instruction program in accordance with the invention;

FIGURE 11 is a flow diagram of an exemplary routine for manipulating
the window color palette in accordance with the invention; 

FIGURE 12 illustrates the relationship between the segments and frames 
comprising an episode of the language instruction program in accordance with the 
invention;

FIGURE 13 illustrates an offset feature that is used when the language
instruction program is played in continuous mode in accordance with the invention; and 

FIGURE 14 is a flow diagram of an example routine for implementing 
the continuous play mode of the invention. 

Detailed Description of the Preferred Embodiment

FIGURE 1 illustrates a computer software language instruction program 
50 that may run on a multimedia computer 52 for use in teaching a foreign language in 
accordance with the invention. A multimedia computer is generally defined as a 
computer having the ability to combine sound, graphics, animation and video. For
purposes of this disclosure and the claims, the term "foreign language" refers to any 
unfamiliar spoken language that a person has an interest in learning or
investigating, and is not meant to refer to nationality. A language that the person does
understand, or is fluent in, is termed a "familiar language."

The multimedia computer 52 typically includes a processing unit 54 that
is controlled by an operating system 56, memory 58 connected to the processing unit,
one or more data or instruction input devices such as a keyboard 55 and a pointing
device 57, a video display 59, and one or more internal or external speakers 60. The
pointing device 57 may be a computer mouse, a track ball, or other device that provides
cursor control. The memory 58 generally comprises, for example, random access
memory (RAM), read only memory (ROM), magnetic storage media, such as a hard
drive, floppy disk, or magnetic tape, and optical storage media, such as a CD-ROM. A
graphical user interface (GUI) 62 within the language instruction program 50 interfaces
between the operating system 56 and the internal process application of the language
instruction program. The multimedia computer 52 may be a commercially available
personal computer, such as a Macintosh™, International Business Machines (IBM)™,
or IBM-compatible personal computer. When used with IBM and IBM-compatible
personal computers, the operating system 56 may incorporate a windowing environment
such as Microsoft Windows® or OS/2™.

FIGURE 2 illustrates a series of steps that describe the process that a
program developer may go through when creating a preferred embodiment of the
language instruction program. The language instruction program preferably
incorporates a storyline, an animated pronunciation guide, and translation features to
provide an entertaining and informative foreign language aid. The depicted steps are
shown to assist in describing the invention, and are not steps carried out by the language
instruction program itself.

At block 70, a story is selected for use in creating an embodiment or
version of the language instruction program. In a version of the language instruction
program described below, it is assumed for clarity in this discussion that the story is
adapted from an episode of the popular American television series "Murder She Wrote."
In this version of the language instruction program, the program is designed to assist
Japanese reading/speaking persons in learning the English language. It will be
appreciated by those skilled in the art, however, that the story may stem from other
sources, such as other television programs, books, or movies, and that the particular
language being taught is not germane to the invention.

At block 72, individual pictorial frames from the story, e.g., a "Murder
She Wrote" episode, are selected. The number and nature of the frames selected will
depend upon the particular story used, the length desired for a particular version of the
language instruction program, and the amount of text that a program developer wishes
to incorporate into that version. After selection of a story and the desired frames, dialog
balloons are created, each including a phrase in the foreign language relating to one or
more of the selected frames, as indicated at block 74. At block 76, the phrases in the
dialog balloons are translated into the familiar language. This translation is a colloquial
translation in that the resultant text is not necessarily verbatim, but a representation of
each phrase that is appropriate for the familiar language.

At block 78, a verbatim translation is made of the individual words in
each phrase. This translation typically includes the familiar language dictionary
definitions of the foreign language words. In addition, translations of phrases in which
the word is found, or of words that are related to the word being translated, may be
included. As an example, a translation of the word "embarrassed" might also include
translations of "embarrass" and "to be embarrassed." Thus, in some instances, the
process results in a group of related words in the foreign language, with their familiar
language translations placed nearby.

At block 80, a pronunciation guide is created for the individual words in
each phrase. The pronunciation guide, described further below, is an animated or video
representation of a person's lips correctly enunciating the individual word. In one
embodiment, the pronunciation guide is created by videotaping a person's mouth as the
person pronounces each individual word, digitizing the videotaped information, and
then linking the digitized information to each word for subsequent recall. The
pronunciation guide provides a significant advantage over prior teaching aids because it
allows a viewer to see the appropriate movements of a mouth as a word is spoken.
Those skilled in the art will appreciate that other animation techniques may also be used
to accomplish this goal. At block 82, the program developer uses the information
gathered, created, and stored in blocks 70-80 to create a version of the language
instruction program.

A more in-depth understanding of the language instruction program 50
may be acquired from the following screen shots taken from a prototype of the language
instruction program. With reference to FIGURE 3A, at start-up a dialog box 100
indicates to a viewer that a version of the language instruction program on CD-ROM,
titled "Murder She Wrote," has been detected in the CD-ROM drive of the multimedia
computer. At this point, the viewer may use the computer pointing device to: select
"OK" at dialog box 102 to continue; select a Japanese phrase indicating that another
CD-ROM may be inserted at dialog box 104; or quit the program by selecting dialog
box 106.

FIGURE 3B illustrates a subsequent screen shot in which the viewer is
prompted to use the computer keyboard to enter his or her name, shown in dialog box
108. After the viewer's name has been entered, dialog box 102 may be selected to
continue, or the viewer can exit the program by selecting dialog box 106. Upon
continuing, the screen shot shown in FIGURE 3C appears, where the viewer may either
start from the beginning of the episode by selecting box 102, or resume from a point in
the episode at which the viewer quit in a prior session by selecting a dialog box 110.

FIGURE 4A illustrates a display box 120 of a frame in the "Murder She
Wrote" episode. For ease of description, the display box 120 may be broken into six
components: a pronunciation guide 122, located in the upper left corner; a control panel
124 having discrete icons or selection areas, located in the lower left corner; a frame
display 126, located in the upper right corner, including a dialog balloon 136; a
translation window 128 (currently blank), located in the lower right corner; a control bar
130, located between the frame display 126 and the translation window 128; and a
message bar 132, located just below the translation window 128. Each of the six
components contained in the display box 120 is described further below.

The following is a row-by-row explanation of the control icons/selection
areas in the control panel 124:

ROW 1: Lip Icon 150 - selecting the lip icon, or anywhere on the pronunciation
guide 122 itself (e.g., using the computer keyboard or pointing device), after
having highlighted a word by use of the graphical user interface results in a display of
an animated enunciation of the word in the pronunciation guide area of the display box.
The word is simultaneously played over the speakers.

Status Window 152 - displays the current frame number/total number of
frames in a given version of the language instruction program.

Ear Icon 154 - select to play or repeat the phrase in the dialog balloon.

ROW 2: Back Arrow 156 - go back to a previous frame.
Forward Arrow 158 - move ahead to the next frame.

ROW 3: Control Icons 160 - provide, from left to right, a means to
go to the beginning of the story, to rewind back a set number of frames, e.g., ten frames,
to fast forward a set number of frames, and to proceed to the end of the story. The
functions in the back/forward and control icons may also be performed using the control
bar 130.

ROW 4: Start Auto-Play Icon 162 and Stop Auto-Play Icon 164 -
these allow playing of the story in a "continuous mode" in which the soundtrack from
the story is played at normal speed while the corresponding frames are displayed.

ROW 5: Icons 166 - provide a toggle between normal and slow
speech. In slow speech mode, the audible portion of each phrase in the dialog balloon is
stated more slowly so that the viewer can better ascertain what is being said. While these
could be used to control the speed of the pronunciation guide playback, in a preferred mode the
audiovisual playback of the pronunciation guide is already at a relatively slow speed,
and thus is not affected.

ROW 6: Balloon Icons 168 - provide a toggle between a normal
mode in which the dialog balloon 136 is shown, and a hidden mode in which the dialog
balloon is hidden.

ROW 7: "Cast of Characters" Icon 170 - select to display a screen
having a picture of each character from the current story.

ROW 8: "Dictionary Mode" Icon 172 - select to display an
alphabetical listing of each word contained in the story, beginning with a highlighted
word from the dialog balloon 136, if there is one. The definitions of these words are
displayed in the translation window 128.
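
The frame navigation offered by rows 2 and 3, together with the status window text, can be sketched as follows in Python. This is an illustration only: the class and method names are hypothetical, the ten-frame jump is the example value given above, and clamping at the first and last frame is an assumption, since the behavior at the ends of the story is not specified.

```python
class FrameNavigator:
    """Tracks the current frame for the back/forward and control icons."""

    def __init__(self, total_frames, jump=10):
        if total_frames < 1:
            raise ValueError("a story needs at least one frame")
        self.total = total_frames
        self.jump = jump      # set number of frames for rewind/fast forward
        self.current = 1      # frames are numbered from 1, as in the status window

    def _goto(self, frame_number):
        # Assumption: requests past either end stop at the first or last frame.
        self.current = max(1, min(self.total, frame_number))
        return self.current

    def back(self):
        return self._goto(self.current - 1)

    def forward(self):
        return self._goto(self.current + 1)

    def to_start(self):
        return self._goto(1)

    def rewind(self):
        return self._goto(self.current - self.jump)

    def fast_forward(self):
        return self._goto(self.current + self.jump)

    def to_end(self):
        return self._goto(self.total)

    def status(self):
        # Text for the status window: current frame number/total number of frames.
        return f"{self.current}/{self.total}"
```

Routing every movement through one clamping helper keeps the icons and the control bar consistent, because both can call the same methods.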

With continued reference to FIGURE 4A, the frame display 126 depicts
the current frame in the episode. In FIGURE 4A, the frame display illustrates a
building 134 and a dialog balloon 136. The dialog balloon 136 displays text from a
conversation carried on in building 134. It is noted that the text from the dialog
balloons throughout the episode is printed in the foreign language being taught, in this
case English. As each dialog balloon 136 appears, the phrase contained in the dialog
balloon 136 is played over the speaker(s).

In FIGURE 4B, the viewer has instructed the program to display a
colloquial translation in Japanese (the familiar language) of the phrase found in dialog
balloon 136. In one embodiment, the colloquial translation is displayed by selection of
the tail of the dialog balloon 136 itself, such as by manipulation and triggering or
actuation of a cursor controller. Also in FIGURE 4B, the message bar 132 has been
changed to indicate that the audio portion of the phrase in the dialog balloon 136 may
be repeated by clicking on the dialog balloon.

FIGURE 5 depicts the frame from FIGURE 4B, but wherein a viewer
has selected the word "interested" from the phrase in the dialog balloon 136, such as by
manipulation and triggering of a cursor controller. As a result, the word "interested" is
highlighted within the dialog balloon and the familiar language (Japanese) dictionary
definition of the word is displayed in the translation window 128. Dependent upon the
particular embodiment of the language instruction program, other information about a
selected word may also be displayed in the translation window. In this example, the
Japanese definition of the phrase "am not interested" is also displayed in the translation
window 128.

FIGURES 6A-6L illustrate the operation of the pronunciation guide 122.
The information displayed in FIGURE 6A is identical to that of FIGURE 5, except
that the lips icon in the pronunciation guide shown in FIGURE 5 has been replaced by a
digitized display of a person's lips. The remaining FIGURES 6B-6L show only the
contents of the pronunciation guide 122. The pronunciation guide is invoked by
highlighting a word in the dialog balloon that a user wishes to both hear and see
enunciated and then selecting the pronunciation icon 150, contained in the upper left
corner of the control panel 124. The highlighted word will be simultaneously heard
from the speakers and displayed in pronunciation guide 122.

The sequential illustrations in FIGURES 6A-6L attempt to show the
sequential animation of a person speaking the word "interested." In FIGURE 6A, the
speaker shown in the dialog balloon has not yet begun to pronounce the word. In
FIGURES 6B-6E, the speaker is pronouncing the "in" portion of the word; in FIGURES
6F-6G the speaker is pronouncing the "ter" portion of the word; and in FIGURES 6H-6L
the speaker is pronouncing the "ested" or remainder of the word. In the actual language
instruction program, the enunciation of the entire word is animated. The clips shown in
FIGURES 6A-6L are provided to further the understanding of the invention.

Both hearing and seeing a word as it is being pronounced greatly
enhances the learning process. The moving lips are readily visible and the word may be
repeated as often as necessary. Preferably, the lips are displayed in a window much
smaller than one-half the total display area so as not to interfere with other portions of
the display. It also is preferred that essentially only the lips be shown, without other
facial features that could cause a distraction, and that the lips themselves be colored or
darkened in contrast to the surrounding background.

FIGURE 7 illustrates the dictionary mode of the language instruction
program, which is entered by selecting icon 172 in the control panel. In the example of
FIGURE 7, the word "I" was highlighted prior to entering the dictionary mode. The
translated dictionary definition of "I" is displayed in the translation window 128.
Further, a small dictionary window 180 appears, showing an alphabetical listing of the
words in the dictionary following the letter "I." An "OK" button 182 allows a viewer to
exit the dictionary mode. It should be noted that the status window 152 is revised in
the dictionary mode to indicate that this particular "I" is the first occurrence of 48 total
occurrences in the story. Further, the back and forward arrows 156 and 158 may now
be used to go to previous and subsequent examples, respectively, of the highlighted
word. This feature allows a user to easily observe different occurrences of the same
word in the story to gain a better understanding of that word in the context of various
sentences.
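
The occurrence browsing described for the dictionary mode can be sketched in Python as follows. The sketch is illustrative and rests on assumptions: the function and class names are hypothetical, each frame is represented only by the text of its dialog balloon, words are compared case-insensitively, and the arrows stop at the first and last occurrence.

```python
import re
from collections import defaultdict


def build_occurrence_index(frame_phrases):
    """Map each lower-cased word to the 1-based frame numbers where it occurs."""
    index = defaultdict(list)
    for frame_number, phrase in enumerate(frame_phrases, start=1):
        for word in re.findall(r"[A-Za-z']+", phrase):
            index[word.lower()].append(frame_number)
    return dict(index)


class OccurrenceBrowser:
    """Steps through the occurrences of one highlighted word."""

    def __init__(self, word, index):
        self.frames = index[word.lower()]
        self.position = 0  # start at the first occurrence

    def status(self):
        # Text for the status window: occurrence number/total occurrences.
        return f"{self.position + 1}/{len(self.frames)}"

    def forward(self):
        # Forward arrow: go to the subsequent occurrence, if any.
        self.position = min(self.position + 1, len(self.frames) - 1)
        return self.frames[self.position]

    def back(self):
        # Back arrow: go to the previous occurrence, if any.
        self.position = max(self.position - 1, 0)
        return self.frames[self.position]
```

A word that appears twice in one balloon is counted twice, which matches a status display that counts occurrences rather than frames.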

FIGURE 8 illustrates the cast of characters mode of the language
instruction program achieved by selecting icon 170 in the control panel. Using this
mode, a user may select any picture to get a description of that character. A pair of
windows 182 and 184 provide descriptions of the television series and episode,
respectively, of the current version of the language instruction program. An exit button
186 allows a user to return to the main menu, shown in FIGURE 4A.

The above description is primarily directed toward the user interface
aspects of the invention. The following describes some programming aspects of the
invention, including two primary databases and other details. In one embodiment,
the language instruction program is written in C++ using
Borland ObjectWindows. A low-level audio interface for Windows (.WAV) is used for
the audio portion of the language instruction program, except for the pronunciation
guide, which utilizes Microsoft's Media Control Interface (MCI)
with the .AVI format. This embodiment of the language instruction program
includes a number of C++ modules. Representative modules are listed below:



Module 1  - Program entry and initialization
Module 2  - Creation and management of top level windows
Module 3  - Manages program introduction (theme and opening dialogs)
Module 4  - MCI routines for pronunciation guide
Module 5  - Dictionary database
Module 6  - Implements balloon edit mode
Module 7  - Implements icon/selection area behavior
Module 8  - Implements the character screen behavior
Module 9  - Compiles the episode and dictionary data structures
Module 10 - Displays frame bit maps and sets window's palette
Module 11 - Manages pronunciation guide window
Module 12 - Manages memory allocation
Module 13 - Manages the scroll bar behavior
Module 14 - Manages the status line information
Module 15 - Implements the control panel
Module 16 - Manages the translation window
Module 17 - Includes an error and message utility
Module 18 - Manages the viewing of frames and the playing of sound in the frame display
Module 19 - Includes low-level audio routines for sound



One skilled in the art will recognize that other embodiments that include 
other languages and data formats can be utilized to implement this invention. Also,
different code arrangements and module groupings can be utilized. 

The language instruction program can be implemented using two
primary data structures: an episode data structure 198, shown in FIGURE 9; and a
dictionary data structure, shown in FIGURE 10. With reference to FIGURE 9, an
episode 200 comprises a linked list of nodes or elements. A plurality of frame elements
202 are at the top of the linked list and contain the individual pictorial frames from the
episode 200, as described in block 72 of FIGURE 2 and accompanying text. Each
frame element 202 is linked to either one or two adjacent frame elements 202, shown by
arrows 204 and 206. Each frame element 202 contains one or more segments 208 of the
audio for the episode that correspond to that frame. The segments may include multiple
sentences, and are stored in an audio file. The segments are also linked to one another,
indicated by the arrows 207 and 209. It is at this level that the text and sound in the
segment, each segment being associated with a frame, are manipulated as the viewer
uses the forward and back buttons of the control panel to peruse the episode.

At the next level of the database, the text segments from each frame are
encompassed within a dialog balloon (as described above), the translational string and
physical characteristics of which are included in a number of corresponding dialog
balloon elements 210. Below the dialog balloon elements 210 are sentence elements
212. The sentence elements 212 are the textual equivalents of the audio in the segments
208, broken down sentence by sentence. At the next level, the words comprising each
sentence in the sentence elements 212 are separated and stored as word elements 214.

A dictionary node 216 is used to link each of the word elements with the
dictionary database of FIGURE 10. In the example shown in FIGURE 9, a dictionary
node 216 contains the word "about". Also linked to each word element 214 is a pointer
indicating other occurrences of that word in the episode 200, shown by blocks 218 and
220.
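
The levels of the episode data structure can be summarized in a short Python sketch. It is a simplification under stated assumptions, not the structure of FIGURE 9 itself: all class, field, and function names are hypothetical, the dialog balloon element is reduced to a single translation string on the segment, the links between segments and between occurrences are omitted, and sentence and word splitting use simple regular expressions.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WordElement:
    text: str
    dictionary_key: str  # stands in for the link to the dictionary data structure


@dataclass
class SentenceElement:
    words: List[WordElement]


@dataclass
class Segment:
    audio_file: str           # hypothetical name of the audio for this segment
    balloon_translation: str  # stands in for the dialog balloon element
    sentences: List[SentenceElement]


@dataclass
class FrameElement:
    image_file: str
    segments: List[Segment] = field(default_factory=list)
    # Links to adjacent frames; excluded from repr/compare to avoid recursion.
    prev: Optional["FrameElement"] = field(default=None, repr=False, compare=False)
    next: Optional["FrameElement"] = field(default=None, repr=False, compare=False)


def make_segment(audio_file, balloon_translation, text):
    """Break balloon text into sentence elements and word elements."""
    sentences = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = [WordElement(w, w.lower()) for w in re.findall(r"[A-Za-z']+", sentence)]
        sentences.append(SentenceElement(words))
    return Segment(audio_file, balloon_translation, sentences)


def link_frames(frames):
    """Link each frame to its neighbors and return the first frame."""
    for earlier, later in zip(frames, frames[1:]):
        earlier.next = later
        later.prev = earlier
    return frames[0] if frames else None
```

The first and last frames end up with one neighbor and every other frame with two, which corresponds to each frame element being linked to either one or two adjacent frame elements.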

FIGURE 10 illustrates a binary tree 230 that can be used to implement
the dictionary data structure of the language instruction program. At its topmost level,
the binary tree 230 includes the letters A through Z, referenced by the variable
"dictionary [0-25]", as shown in blocks 232. Each node of the binary tree 230 contains
a key, with words above a certain letter segment added to one subtree and words below
a certain letter segment added to the other subtree. As shown in FIGURE 10, the level
1 keys are labeled by reference numeral 234, the level 2 keys by reference numeral 236,
and the level 3 keys by reference numeral 238. For clarity, subsequent key levels are
not shown.

As an example of the link between the two data structures, block 240 
illustrates the word "about" and its link to the episode data structure of FIGURE 9 
through blocks 242, 244 and 246, i.e., occurrence 1, occurrence 2, and occurrence 3, 
respectively. 
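
A dictionary of this general shape, with one tree per initial letter and a list of occurrences on each word, can be sketched in Python as follows. The sketch makes several assumptions: the class and method names are hypothetical, each letter slot holds an ordinary unbalanced binary search tree ordered by whole-word comparison (the keying by "letter segment" is not reproduced), and an occurrence is represented as a hypothetical (frame number, word position) pair.

```python
class DictionaryNode:
    """One word in the tree, with links back to its occurrences in the episode."""

    def __init__(self, word):
        self.word = word
        self.occurrences = []  # e.g. (frame number, word position) pairs
        self.left = None       # words that sort before this one
        self.right = None      # words that sort after this one


class Dictionary:
    def __init__(self):
        self.roots = [None] * 26  # one tree per letter A-Z, like dictionary[0-25]

    @staticmethod
    def _slot(word):
        slot = ord(word[0]) - ord("a") if word else -1
        if not 0 <= slot < 26:
            raise ValueError("word must start with a letter A-Z")
        return slot

    def add(self, word, occurrence):
        """Insert the word if needed, then record one more occurrence of it."""
        word = word.lower()
        slot = self._slot(word)
        if self.roots[slot] is None:
            self.roots[slot] = DictionaryNode(word)
        node = self.roots[slot]
        while node.word != word:
            side = "left" if word < node.word else "right"
            child = getattr(node, side)
            if child is None:
                child = DictionaryNode(word)
                setattr(node, side, child)
            node = child
        node.occurrences.append(occurrence)

    def find(self, word):
        """Return the node for the word, or None if it is not in the dictionary."""
        word = word.lower()
        node = self.roots[self._slot(word)]
        while node is not None and node.word != word:
            node = node.left if word < node.word else node.right
        return node

    def words_under(self, letter):
        """Alphabetical listing of the words stored under one initial letter."""
        listing = []

        def walk(node):
            if node is not None:
                walk(node.left)
                listing.append(node.word)
                walk(node.right)

        walk(self.roots[self._slot(letter.lower())])
        return listing
```

An in-order walk of one letter's tree yields the alphabetical listing that a dictionary window could display, and the occurrence list on a node plays the part of the occurrence blocks linked to a word such as "about".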

As described above, it is preferable that the pronunciation guide be
played while a viewer can see the word being pronounced. In the embodiment of the
invention shown in FIGURE 6A, the pronunciation guide 122 is located in the upper
left corner and is played while the current frame is simultaneously shown in the frame
display 126. In one embodiment, the pronunciation guide 122 is stored in a .AVI file
and the frame display 126 is stored as a bit map (.BMP file). This embodiment presents
a programming difficulty in some windowing environments, e.g., Microsoft
Windows®, in that the window palette is typically controlled by only a single entity,
e.g., an application or a driver running within an application.

Because the pronunciation guide 122 and the frame display 126 use
different window palettes, window palette conflicts may occur when one of the entities
is invoked as the other is being displayed. In this context, the term window palette
conflict defines a situation that occurs when the color scheme used in the current entity
changes the color palette, and thereby distorts or skews the color scheme in an adjacent,
noncontrolling entity.

As an example, assume that the frame display is currently showing the
picture illustrated in FIGURE 6A. Assume next, upon command from a viewer, that the
pronunciation guide 122 is invoked. Without an accommodation, the pronunciation
guide 122, stemming from a .AVI file, will change the window palette to the color
scheme appropriate for the .AVI file and, as a result, the colors in the frame display 126
will change accordingly. If the color scheme from the .AVI file is different from the
color scheme of the frame display, the color in the frame display will change, and may
lead to an undesirable display in the frame display portion of the window.

FIGURE 11 illustrates a solution to the above-described problem. The 
solution includes the assumption that 256 colors are available and being used by the 
multimedia computer. Those skilled in the art will appreciate that a different number of 
colors may also be used. At block 270, twenty of the available 256 colors are reserved 
for the windows system. At block 272, a test is made to determine if the color palette is 
to be changed. The color palette will often change from its previous setting when the 
pronunciation guide is invoked and during frame transitions. For example, the color 
palette will usually be changed between the transition of a frame having an outdoor 
scene and a frame having an indoor scene. 

If the color palette is not to be changed, a test is made at block 274 to 
determine if the routine is done, i.e., if the language instruction program is being exited. 
If the language instruction program is not being exited, the routine loops to block 272. 
If the color palette is to be changed, the first 32 colors of the colors remaining in the 
color palette are set to black at block 276. This will have the effect of reserving these 
colors for use by the pronunciation guide. At block 278, the remaining 204 colors (256 
less (20+32)) are set to the color scheme of the frame to be displayed. The current 
frame is then displayed using the color scheme at block 280. 

At block 282, a test is made to determine if the pronunciation guide is to 
be played, e.g., the viewer has selected the play button. If the pronunciation guide is 
not to be played, the routine loops to block 272. If the pronunciation guide is to be 
played, the animated lips are displayed using the 32 reserved colors only and the sound 
is played over the speakers, shown at block 284. The routine then loops to block 272. 
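
By way of illustration only, the palette partition of blocks 270 through 278 might be sketched as follows. The sketch assumes that the 20 system colors occupy the first palette entries and that colors are (red, green, blue) tuples. Both are assumptions made for simplicity, since the disclosure does not state the position of the system entries. The names build_palette and guide_slot_range are hypothetical.

```python
PALETTE_SIZE = 256   # colors assumed available on the multimedia computer
SYSTEM_COLORS = 20   # block 270: reserved for the windowing system
GUIDE_COLORS = 32    # block 276: reserved for the pronunciation guide
FRAME_COLORS = PALETTE_SIZE - (SYSTEM_COLORS + GUIDE_COLORS)  # block 278: 204

BLACK = (0, 0, 0)


def build_palette(system_colors, frame_colors):
    """Return a 256-entry palette for the frame about to be displayed.

    Assumed layout: system colors first, then 32 entries set to black for
    the pronunciation guide, then up to 204 colors taken from the frame.
    """
    if len(system_colors) != SYSTEM_COLORS:
        raise ValueError("expected exactly 20 system colors")
    if len(frame_colors) > FRAME_COLORS:
        raise ValueError("a frame may use at most 204 colors")
    guide_slots = [BLACK] * GUIDE_COLORS
    # Pad with black when the frame uses fewer than 204 colors.
    padding = [BLACK] * (FRAME_COLORS - len(frame_colors))
    return list(system_colors) + guide_slots + list(frame_colors) + padding


def guide_slot_range():
    """Palette indices that only the pronunciation guide may use."""
    return range(SYSTEM_COLORS, SYSTEM_COLORS + GUIDE_COLORS)
```

Because the animated lips draw only from the entries in guide_slot_range(), playing the pronunciation guide at block 284 need not disturb the 204 entries used by the frame display.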

The segments 208 of the episode data structure shown in FIGURE 9 will 
now be described in greater detail. As described above, each segment 208 of an episode 
of the language instruction program comprises a portion of the audio from the episode 
stored in a file, e.g., a "wave" (.WAV) file. Each segment 208 is associated with a 
display portion, e.g., a bit map (.BMP) file, that corresponds to one of the frames 202. 
FIGURE 12 illustrates an exemplary embodiment of the invention wherein the audio 
portion, i.e., all of the segments 208 of the episode, is stored as a single wave file 300 
and each frame from the episode is stored as a separate bit map file. Bit map files 302, 
304 and 306, corresponding to Frames 1, 2, and 3, respectively, are shown. 

The wave file 300 is separated into audio portions that correspond to the 
segments 208 by breaks 308. Further, each segment 208 is associated with a frame by 
pointers 310. The example in FIGURE 12 indicates that segments #1 and #2 are 
associated with frame 1, segments #3 and #4 with frame 2, and segment #5 with frame 
3. 
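
By way of illustration only, the arrangement of FIGURE 12 might be represented as follows. The sketch assumes that the breaks 308 are expressed as millisecond positions in the wave file and that the pointers 310 are frame numbers. The time values in the usage note are invented for the example, and the names Segment and split_at_breaks are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Segment:
    number: int    # segment number within the episode, starting at 1
    start_ms: int  # where the segment's audio begins in the wave file
    end_ms: int    # where the segment's audio ends in the wave file
    frame: int     # frame (bit map) this segment points to


def split_at_breaks(breaks_ms: Sequence[int], total_ms: int,
                    frame_pointers: Sequence[int]) -> List[Segment]:
    """Divide one wave file into segments and attach a frame to each."""
    bounds = [0] + list(breaks_ms) + [total_ms]
    if len(frame_pointers) != len(bounds) - 1:
        raise ValueError("need exactly one frame pointer per segment")
    return [
        Segment(i + 1, bounds[i], bounds[i + 1], frame_pointers[i])
        for i in range(len(frame_pointers))
    ]
```

For instance, split_at_breaks([2000, 4500, 6000, 9000], 11000, [1, 1, 2, 2, 3]) yields five segments in which segments 1 and 2 point to frame 1, segments 3 and 4 to frame 2, and segment 5 to frame 3, as in the example of FIGURE 12.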

During single-play mode operation of the language instruction program, 
a viewer will use the "back" and "forward" arrows 156 and 158 or other icons on the 
control panel 124 to control viewing of segments in the episode. When a segment has 
been selected, a new frame may need to be displayed. In that case, the bit map file for 
that frame is retrieved from the memory, e.g., CD-ROM, processed, and displayed. 
Otherwise, the current frame being displayed remains, although the dialog balloon will 
change to correspond to the segment. In either case, the audio portion associated with 
the segment is retrieved from the wave file and played. The language instruction 
program then awaits further commands from the viewer, wherein the process is repeated 
for each segment selection. 
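
By way of illustration only, the handling of a segment selection in single-play mode might be sketched as follows. The sketch returns a list of actions rather than performing any display or audio work. The mapping SEGMENT_TO_FRAME follows the example of FIGURE 12, while the function name select_segment and the action labels are hypothetical.

```python
# Segment-to-frame pointers taken from the example of FIGURE 12.
SEGMENT_TO_FRAME = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3}


def select_segment(current_frame, segment):
    """Return the frame now on screen and the actions for this selection."""
    actions = []
    frame = SEGMENT_TO_FRAME[segment]
    if frame != current_frame:
        # A new frame is needed: retrieve, process, and display its bit map.
        actions.append(("load_bitmap", frame))
    # In either case the dialog balloon changes and the audio is played.
    actions.append(("show_balloon", segment))
    actions.append(("play_audio", segment))
    return frame, actions
```

Moving from segment 1 to segment 2 in this sketch produces no bit map retrieval, whereas moving from segment 2 to segment 3 produces a "load_bitmap" action for frame 2 ahead of the balloon and audio actions.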

The foregoing retrieval process is sufficient as long as the episode is 
being viewed segment by segment under the viewer's control. However, when the 
language instruction program is being operated in "continuous play mode," the delay 
associated with the frame bit map retrieval and processing may cause the audio portion 
of a segment to begin prior to the frame's display. In continuous mode the language 
instruction program will play the entire wave file, with the bit map pointers 310 
controlling the screen display during the playback. To avoid disadvantageous results 
stemming from the retrieval/processing delay described above, an "offset" may be 
associated with each audio segment such that the display portion of a frame is retrieved 
and processed before the audio portion begins to play. 

FIGURE 13 illustrates the use of an offset 312 to begin the frame/bit 
map retrieval process ahead of the audio playback during continuous play mode. 
Basically, the offsets instruct the language instruction program to begin the process of 
displaying the next frame a bit sooner than in the single-play mode. Thus, by the time 
the audio portion of the segment begins, the frame information will already be present. 
It is noted that the entire audio track from the wave file is still played, and only the 
timing of the frame displays is changed by the offset. 
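
By way of illustration only, the effect of the offset 312 on frame timing might be sketched as follows. The sketch assumes a single offset expressed in milliseconds and applied at every frame change, whereas the disclosure permits an offset to be associated with each audio segment. The function name frame_load_schedule and the time values in the usage note are hypothetical.

```python
def frame_load_schedule(segments, offset_ms):
    """Return (frame, load_start_ms) for each frame change in playback order.

    segments is a list of (audio_start_ms, frame) pairs. Retrieval of a
    frame's bit map starts offset_ms before the audio of the first segment
    that uses the frame, and never before time zero.
    """
    schedule = []
    previous_frame = None
    for start_ms, frame in segments:
        if frame != previous_frame:
            schedule.append((frame, max(0, start_ms - offset_ms)))
            previous_frame = frame
    return schedule
```

With segments beginning at 0, 2000, 4500, 6000 and 9000 milliseconds and pointing to frames 1, 1, 2, 2 and 3, an offset of 300 milliseconds starts the retrieval of frame 2 at 4200 and of frame 3 at 8700. The audio start times themselves are unchanged, consistent with the observation that only the timing of the frame displays is affected.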

FIGURE 14 is a flow diagram illustrating the operation of an exemplary 
embodiment of the language instruction program in continuous play mode. At block 
340, the bit map for frame 1 is retrieved, processed, and displayed. At block 342, the 
text from segment 1 is displayed on the monitor, i.e., in a dialog balloon. At block 344, 
the sound from segment 1 is placed in a queue such that it will be played by the 
multimedia computer. At this point, the sound from segment 1 will begin to play, as 
indicated by the comment box 346. 

At block 348, the sound portion of segment 2 is placed into the queue. 
At block 350, the variable N is set equal to 2. A test is then made at block 352 to 
determine whether the sound from segment N-1 is finished playing. If the sound from 
segment N-1 is not finished playing, the program loops to block 352. If the sound from 
segment N-1 is finished playing, the sound from segment N will begin to play, as 
indicated by the comment box 353. A test is then made at block 354 to determine 
whether the frame is to be changed. This will occur when all of the segments from a 
particular frame have been played, and a new frame in the episode is to be displayed. 

If the frame is to be changed, the new frame is retrieved, processed, and 
displayed, as shown at block 356. Once this is accomplished, or if the frame was not to 
be changed, the text of segment N is displayed at block 358. At this point, if there was 
a change made in the frame, an offset to the sound queue may be applied, as discussed 
in FIGURE 13 and accompanying text. This is indicated by comment box 360. 

At block 362, the sound from segment N+1 is placed into the queue. At 
block 364, the variable N is incremented by 1. A test is made at block 366 as to whether 
an exit condition occurs, e.g., a viewer has instructed the program to end. If so, the 
routine terminates. Otherwise, the routine loops to block 352. 
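
By way of illustration only, the loop of FIGURE 14 might be simulated as follows. The sketch records an event log instead of driving a display or sound device, treats removal from the queue as the moment a sound finishes playing, and exits when the segments run out rather than on a viewer command. These simplifications, the function name continuous_play and the event labels are assumptions and are not part of the disclosure. The offset of comment box 360 is omitted.

```python
from collections import deque


def continuous_play(segments):
    """Simulate continuous play mode and return the ordered event log.

    segments is a list of (frame, text) pairs in playback order, so that
    segment N is segments[N - 1].
    """
    if not segments:
        return []
    log = []
    queue = deque()
    current_frame = segments[0][0]
    log.append(("show_frame", current_frame))  # block 340
    log.append(("show_text", 1))               # block 342
    queue.append(1)                            # block 344
    if len(segments) > 1:
        queue.append(2)                        # block 348
    n = 2                                      # block 350
    while n <= len(segments):                  # stands in for block 366
        finished = queue.popleft()             # block 352: N-1 has finished
        log.append(("sound_done", finished))
        frame = segments[n - 1][0]
        if frame != current_frame:             # block 354
            log.append(("show_frame", frame))  # block 356
            current_frame = frame
        log.append(("show_text", n))           # block 358
        if n + 1 <= len(segments):
            queue.append(n + 1)                # block 362
        n += 1                                 # block 364
    return log
```

The queue never holds more than two sounds in this sketch: the one that is playing and the one that follows it, which is what allows the audio to continue without a gap while the frame and text are being updated.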

From the foregoing, it will be appreciated that the language instruction 
program in a preferred embodiment provides a number of advantages. One advantage is 
that, with regard to any particular word in an episode, the language instruction program 
can: (1) display the foreign language enunciation of the word by use of the 
pronunciation guide; (2) play the pronunciation of the word over a speaker; (3) play 
each phrase of dialog in which the word is used in the episode; (4) display the word in 
each foreign language context in which it appears in the episode; (5) display the familiar 
language dictionary definitions of that word; (6) display familiar language definitions of 
words that are similar to the word or in phrases in which the word may be contained; 
and (7) display the dictionary listings of the word and words around the word. Each of 
these features helps to facilitate the learning of a foreign language by providing a variety 
of associated audio and visual representations of the word, alone and in context. 

While the preferred embodiment of the invention has been illustrated and 
described, it will be appreciated that various changes can be made therein without 
departing from the spirit and scope of the invention. 

We claim: 



1. A method in a computer system for aiding foreign language 
instruction, the computer system having a memory, a display device, and a speaker, the 
method comprising the steps of: 

storing in the memory an audiovisual presentation of a plurality of words in a 
foreign language, the audiovisual presentation of each word having 

an audible component that includes a pronunciation of the word in the 

foreign language, and 

a visual component that includes a textual representation of the word in 
the foreign language and a graphical representation of lips enunciating the word; 
selecting a word from among the plurality of words; 
retrieving the stored audiovisual presentation for the selected word; and 
displaying on the display device the visual component of the retrieved 
audiovisual presentation, including the graphical representation of lips and the textual 
representation, while playing the audible component of the retrieved audiovisual presentation 
through the speaker, so that a user can see lips enunciating the selected word while hearing 
the selected word being spoken and while seeing the selected word as written text, thereby 
aiding the user in learning how to pronounce the selected word in the foreign language. 

2. A method in a computer system for aiding foreign language 
instruction, the computer system having an audiovisual presentation for each of a plurality of 
words in a foreign language, the audiovisual presentation of each word having an audible 
pronunciation, a visual text representation, and a graphical representation of lips enunciating 
the word, the method comprising the steps of: 

selecting a word from among the plurality of words in the foreign language; 

and 

concurrently playing back the audible pronunciation of the selected word 
while displaying the text representation of the selected word and the lips enunciating the 
selected word. 



3. The method of claim 2 wherein the graphical representation of lips 
enunciating each word is constructed from a video animation. 

4. The method of claim 2, further comprising the step of controlling the 
speed of the lips enunciating the selected word while playing back the audible pronunciation 
of the selected word. 

5. The method of claim 2, further comprising the step of displaying a 
verbatim translation in a familiar language of the selected foreign language word. 

6. The method of claim 5, further comprising the step of, concurrent with 
the displaying of the verbatim translation of the selected foreign language word, displaying a 
verbatim translation in the familiar language of a word that is related to the selected foreign 
language word, thereby providing an example of related usage of the selected foreign 
language word. 

7. The method of claim 2 wherein the plurality of words in the foreign 
language are part of a story. 

8. The method of claim 7, further comprising the steps of: 
displaying a graphical representation of a plurality of characters in the story; 
using the displayed graphical representation, selecting a character; and 
displaying a description of the role of the selected character in the story. 

9. The method of claim 2 wherein the plurality of words in the foreign 
language are part of an audiovisual story, the story having a plurality of video pictures and 
audio portions, each video picture associated with an audio portion of foreign language 
speech relating to the video picture, and further comprising the steps of: 

selecting a video picture from the plurality of video pictures; 

creating a dialog balloon having foreign language visual text that corresponds 
to the speech of the audio portion associated with the selected video picture; and 

displaying the dialog balloon with the foreign language visual text while 
playing the speech of the audio portion that corresponds to the displayed visual text. 

10. The method of claim 9, wherein the step of selecting the word from 
among the plurality of words in the foreign language selects the word from the visual text 
displayed in the displayed dialog balloon. 

11. The method of claim 9, further comprising the steps of: 

creating in the familiar language a colloquial translation of the foreign 
language visual text displayed in the displayed dialog balloon; and 
displaying the colloquial translation. 

12. The method of claim 2 wherein the plurality of words in the foreign 
language is part of an audiovisual story, the story having a plurality of video frames and audio 
segments, each video frame having an associated audio segment, each audio segment having 
corresponding visual foreign language text, and further comprising the steps of: 

displaying a list of the plurality of words in the foreign language; 
selecting a word from the displayed list; and 

playing an audio segment that contains the selected word while displaying the 
associated video frame and while displaying the foreign language text corresponding to the 
audio segment. 

13. The method of claim 2 wherein the plurality of words in the foreign 
language is part of an audiovisual story, the story having an ordered sequence of a plurality of 
video frames and audio segments, each video frame having an associated audio segment and 
having foreign language text corresponding to the associated audio segment, the plurality of 
audio segments comprising a soundtrack, and further comprising the step of: 

displaying a continuous mode start indicator and a continuous mode stop 

indicator; 

in response to selecting the continuous mode start indicator, displaying the 
video frames in the ordered sequence while playing the associated audio segments so that the 
soundtrack in the foreign language is heard in a continuous manner; and 

in response to selecting the continuous mode stop indicator, 

selectively displaying, in a sequence that is not the ordered sequence, 
each video frame while playing the audio segment associated with the video frame and while 
displaying the foreign language text; 

displaying a lip enunciation indicator; and 

in response to selecting the lip enunciation indicator, determining a 
word in the displayed foreign language text and displaying the graphical representation of lips 
enunciating the determined word. 

14. A computer readable memory device for controlling the operation of a 
computer processor according to the method of claim 2, comprising: 

a stored audiovisual presentation for each of a plurality of words in a foreign 
language, the audiovisual presentation of each word having 

an audible pronunciation component; 
a visual text representation component; and 

a graphical representation component of lips enunciating the word, 
such that a program controlling execution of the computer processor can retrieve the stored 
audiovisual presentation of a selected one of the plurality of words to concurrently play back 
the audible pronunciation of the selected word while displaying the visual text representation 
of the selected word and the lips enunciating the selected word. 

15. A method for creating a foreign language instruction audiovisual aid in 
a computer system memory, the method comprising the steps of: 

storing in the memory a plurality of video frames with corresponding audio 
segments of speech, the speech comprising a story in a foreign language; 

storing in the memory a dialog balloon for each audio segment, each dialog 
balloon having visual foreign language text that corresponds to the speech of the audio 
segment; 

associated with each dialog balloon, storing in the memory a colloquial 
translation in the familiar language of the visual foreign language text of the dialog balloon; 
and 

for each word occurring in the stored audio segments of speech, storing an 
animated pronunciation guide, in such a manner that a program reading the memory can 
display the visual foreign language text of the word while playing an audio pronunciation of 
the word and while displaying the animated pronunciation guide for the word. 

16. A computer system for aiding foreign language instruction comprising: 

a speaker; 

a display device; 

a database having a plurality of stored words in a foreign language, each word 
having an associated stored audio representation of the pronunciation of the word, an 
associated visual text representation, and an associated graphical representation of lips that 
demonstrate the enunciation of the word; 

audiovisual display code that, in response to being invoked with an indicated 

word, 

retrieves from the database the stored audio representation, visual text 
representation, and graphical representation associated with the indicated word; and 

displays on the display device the retrieved graphical representation of 
lips that demonstrate the enunciation of the indicated word while displaying on the display 
the retrieved visual text and while playing on the speaker the retrieved audio representation; 
and 

a selection mechanism that selects a word from among the plurality of foreign 
language words stored in the database, invokes the audiovisual display code indicating the 
selected word. 